[SPARK-16786] [Python] [WIP] LDA topic distributions API Call for python#14394
jordy25519 wants to merge 2 commits into apache:master
Can one of the admins verify this patch?
holdenk left a comment:
@supremekai Is this something you are still interested in working on?
  JavaPairRDD.fromRDD(topicDistributions.asInstanceOf[RDD[(java.lang.Long, Vector)]])
}

override def topicDistributions(documents: RDD[(Long, Vector)]): RDD[(Long, Vector)] = {
Is this what we want here? Defining it on the parent when half of the children don't implement it might be confusing to some users.
@holdenk I'm keen to work on this. I definitely agree, but I'm not sure how else to approach this without implementing the logic for distributed LDA models.
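The design concern in this thread can be sketched in plain Python. This is a hypothetical illustration, not Spark's code: the class names `BaseLDAModel`, `LocalModel`, and `DistributedModel`, the method name `topic_distributions`, and the uniform placeholder distribution are all assumptions made for the example. It only shows why a method declared on the parent but implemented by some children can surprise callers at runtime.

```python
from typing import List, Tuple

# (document id, vector) pairs; a stand-in for RDD[(Long, Vector)]
Document = Tuple[int, List[float]]


class BaseLDAModel:
    """Hypothetical parent class; not the Spark class."""

    def __init__(self, k: int) -> None:
        self.k = k  # number of topics

    def topic_distributions(self, documents: List[Document]) -> List[Document]:
        # Declared on the parent, so every subclass exposes the method,
        # including subclasses that cannot actually compute it.
        raise NotImplementedError(
            f"{type(self).__name__} does not support topic_distributions"
        )


class LocalModel(BaseLDAModel):
    def topic_distributions(self, documents: List[Document]) -> List[Document]:
        # Placeholder only: a uniform distribution over k topics stands in
        # for real inference, which this sketch does not attempt.
        uniform = [1.0 / self.k] * self.k
        return [(doc_id, list(uniform)) for doc_id, _ in documents]


class DistributedModel(BaseLDAModel):
    pass  # inherits the parent's stub and fails only when called


docs = [(0, [1.0, 0.0, 2.0]), (1, [0.0, 3.0, 1.0])]
local_result = LocalModel(k=2).topic_distributions(docs)

try:
    DistributedModel(k=2).topic_distributions(docs)
    distributed_failed = False
except NotImplementedError:
    distributed_failed = True
```

Under these assumptions, both subclasses appear to offer the method, but only one of them works; the other fails at call time rather than being absent from the API, which is the confusion raised in the review.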
@supremekai Thanks for the PR! I'm sorry about the inactivity on this. However, now that it has been added to the DataFrame-based API (in pyspark.ml), we will not be adding it to the RDD-based API. Could you please close this issue?
Closes apache#15736 Closes apache#16309 Closes apache#16485 Closes apache#16502 Closes apache#16196 Closes apache#16498 Closes apache#12380 Closes apache#16764 Closes apache#14394 Closes apache#14204 Closes apache#14027 Closes apache#13690 Closes apache#16279
Author: Sean Owen <sowen@cloudera.com>
Closes apache#16778 from srowen/CloseStalePRs.
What changes were proposed in this pull request?
Implemented a Python call to topicDistributions for pyspark.mllib.clustering.LDAModel.
How was this patch tested?
Ran ./dev/run-tests, all passing
Manually verified.
Reused parameter types, return types, etc. from existing API calls so that behaviour is consistent with the existing API.